# Replication code and data for "Missing Data, Speculative Reading"

This repository provides replication code and data for the article **Missing Data, Speculative Reading**.

## Contents
- [data](data) - data specific to this article and source data from the [Shakespeare and Company Project](https://shakespeareandco.princeton.edu/)
- [missing_data](missing_data) - code notebooks for the missing data portion of the article
- [speculative_reading](speculative_reading) - code notebooks for the speculative reading portion of the article
- [appendix](appendix) - additional notebooks with validation, alternate approaches, etc.; work that did not make it into the article
- [figures](figures) - code-generated figures for the article, exported in multiple formats where supported
- [utils](utils) - utility Python code used by multiple notebooks

## Installing dependencies and running code

This code has been tested against **Python 3.9**.
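
If it helps to confirm which interpreter is active before installing anything, a minimal sketch of a version check follows. The helper name `is_tested_version` and the constant `TESTED_VERSION` are illustrative and not part of this repository.

```python
import sys

# Assumption: (3, 9) mirrors the version this README reports as tested;
# other versions may well work but have not been verified here.
TESTED_VERSION = (3, 9)


def is_tested_version(version_info=sys.version_info):
    """Return True if the major.minor version matches TESTED_VERSION."""
    return tuple(version_info[:2]) == TESTED_VERSION


if not is_tested_version():
    found = ".".join(str(part) for part in sys.version_info[:2])
    print(f"Note: running Python {found}; this code was tested against 3.9.")
```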

To run the code, first clone or download the repository.

Python dependencies are documented in `requirements.txt`. We recommend using
a Python virtual environment. Dependencies can be installed with pip:

```sh
pip install -r requirements.txt
```

### Testing

There are unit tests for some utility code, which include checks that data files
are available at the expected locations. To run them, install and run pytest:

```sh
pip install pytest
pytest
```

Code notebooks can be run using JupyterLab or a Jupyter-aware IDE such as VS Code.
They are intended to be run locally, with dependencies installed.

We use [treon](https://github.com/ReviewNB/treon) (`pip install treon`) to
confirm that Jupyter notebooks execute without errors.
